Abstract


Analogical comparison has been found to promote learning 
across many conceptual domains. Here, we ask whether this 
mechanism can facilitate children’s understanding of others’ 
mental states. In Experiment 1, children carried out 
comparisons between characters’ thoughts and reality and 
between characters with true beliefs vs. those with false 
beliefs. Children given this training improved from pre- to 
post-test. In Experiment 2, we used a more minimal 
comparison technique. Children saw a series of three stories 
involving true or false beliefs. There were two between- 
subjects conditions that either facilitated (High Alignability) 
or impeded (Low Alignability) comparison across stories. We 
found that children made more gains from pre- to post-test in 
the High Alignability condition than in the Low Alignability 
condition. We also found effects of production of mental state 
verbs, as assessed in an Elicitation Task. These results 
provide evidence for the role of analogical comparison in 
theory of mind development. 


Keywords: analogy; comparison; theory of mind; false 
belief; cognitive development; social cognition 


Background 


Theory of mind (ToM) refers to the ability to reason about 
the mental states of others and oneself, including desires, 
beliefs, emotions, intentions, and knowledge. Understanding 
how children arrive at this ability has been a central topic 
within cognitive science for decades. The aim of this paper 
is to elucidate the cognitive processes that contribute to this 
development. Specifically, we propose that analogical 
comparison processes contribute to ToM development. We 
describe two experiments that provide evidence for this 
claim. 

In our research, we test children on a set of standard ToM 
tasks, then expose them to comparison-based training, and 
then test them on new versions of the ToM tasks. We chose 
a set of false belief tasks as the pre- and post-tests because 
false belief understanding is considered the litmus test for 
measuring children’s ToM. The ability to pass false belief 
tasks is taken as an indication that children are acquiring a 
representational understanding of mind (Perner, 1991). 

Although some recent research suggests that some aspects 
of false belief understanding emerge very early (Leslie, 
1987; see Baillargeon, Scott, & He, 2010 for a review), 
there is considerable evidence that substantial gains in ToM 
occur between 3 and 5 years of age (Wellman, Cross, & 
Watson, 2001). Further, a comparison of different ToM 
tasks tapping into different types of mental states (e.g., 
desires, beliefs, emotions) suggests that false belief 
understanding is part of a stable developmental trajectory of 
increasingly sophisticated reasoning about mental states 
(Wellman & Liu, 2004). Thus, it appears that children’s 
performance on false belief tasks is a good indication of a 
conceptual understanding of others’ mental states. 


Approaches to ToM Development 


What happens between 3 and 5 years of age that allows 
children to understand others’ mental states? Several 
answers to this question have been proposed. One proposal 
emphasizes the link between ToM and executive function 
(Perner & Lang, 1999). Another proposal (“theory-theory”) 
emphasizes changes in children’s theories, while a third 
proposal emphasizes the role of language. Here, we focus 
on the latter two approaches to ToM development. 

Under the theory-theory approach, children undergo a 
revision of their folk psychological theories between 3 and 5 
years of age that allows them to consider false beliefs 
(Gopnik & Wellman, 1994). Here, theory refers to 
interconnected concepts in the child’s mind that can be used 
to form predictions or expectations about the environment. 
When children are confronted with evidence that contradicts 
or cannot be explained by their current theory, they resolve 
the conflict by revising these theories to account for the new 
evidence. Theory-change is thus an experience-dependent 
process. 

Research on the influence of language on ToM has 
examined several aspects of linguistic knowledge and 
experience, including acquiring sentential complement 
syntax (de Villiers & Pyers, 2002), acquiring mental state 
verbs, and exposure to discourse (Lohmann & Tomasello, 
2003). Lohmann and Tomasello (2003) developed a training 
study in which they found that discourse and sentential 
complement syntax on their own improved false belief 
understanding. However, the greatest gains in performance 
occurred in a condition that provided children with a 
combination of discourse, sentential complement syntax, 
and mental state verbs. A meta-analysis also indicated that 
multiple elements of language contribute to false belief 
understanding (Milligan, Astington, & Dack, 2007). On this 
evidence, language provides an important set of tools 
through which children can consider others’ perspectives. 

In sum, theory-theory emphasizes the importance of 
learning from experiences, but does not explain how 
children arrive at meaningful insights from those 
experiences. And while the language account is also 
compelling, it does not specify how children combine 
language with their experiences in the world to produce 
false belief understanding. We propose that analogical 
comparison processes can help fill in these gaps. In the 
experiments reported here, we created specific training 
experiences designed to facilitate key analogical 
comparisons and thereby provide children with a stronger 
grasp of mental states. 

Analogical comparison has been shown to be a powerful 
learning process that can reveal similarities and differences 
between entities, give rise to new inferences, and uncover 
deep relational structure (Christie & Gentner, 2010; Doumas 
& Hummel, 2013; Gentner, 1983, 2010; Gentner & 
Markman, 1997; Holyoak & Thagard, 1989). One reason to 
think that analogical processes can promote ToM is that 
false belief understanding depends on understanding key 
similarities and distinctions between representations. For 
instance, children must acknowledge that one’s mental 
contents may differ from reality, and that two people may 
hold different mental states concerning the same experience. 
Beyond identifying important commonalities and 
differences, engaging in analogical comparison may give 
rise to abstract relational structures that provide the child 
with a more general understanding of beliefs. 

The proposal that analogical processes can aid in ToM 
development has been made before (Baldwin & Saylor, 
2005; San Juan & Astington, 2012; Bach, 2014; Pham, 
Bonawitz, & Gopnik, 2012). However, empirical evidence 
on these claims is lacking. Our goal here is to test whether 
analogical processes can foster children’s ToM 
understanding. 


Experiment 1 


In Experiment 1, we developed a training procedure using 
comparative questioning to examine whether analogical 
comparison may aid children’s understanding of false 
beliefs. During this training procedure, we modified the 
unexpected contents task (Perner, Leekam, & Wimmer, 
1987) to allow for a comparison between characters who 
held true and false beliefs. Characters’ thoughts were 
displayed in thought bubbles so as to facilitate children’s 
comparisons across entities. Our hypothesis is that with this 
type of explicit comparative questioning, differences 
between characters’ mental states and between mental states 
and reality will become more apparent, allowing children to 
then generalize from these instances to other situations. 
Because this was a novel training approach, whether 
children could make gains in false belief tasks in a single 
session was unclear. Thus, as a first pass, we developed a 
very strong intervention, as described below. There were 
three conditions: the key Compare Thoughts condition and 
two control conditions. In the Baseline condition, children 
received no intervening training between pretest and 
posttest. In the second control condition (the Compare 
Items condition), children answered comparative questions 
(as in the key experimental condition), but these questions 
had nothing to do with mental states. If mental comparisons 
provide children with relational knowledge about mental 
states, children should make gains solely in the Compare 
Thoughts condition. 

Wellman and Liu (2004) reported that the average age of 
children failing the false belief task was about 4 years 6 
months and the average age of children passing this task 
was about 4 years 11 months. We thus focused on the 4;6- 
to-5;0 age range since it is an age at which children may be 
especially ready to gain insight about mental states. In 
addition, given previous work showing possible gender 
differences in ToM tasks (Charman, Ruffman, & Clements, 
2002), we also compared performance between males 
and females. 


Methods 


Participants One hundred ten 4.5- to 5-year-olds from the 
greater Evanston/Chicago area participated. The racial and 
economic composition of the sample reflected those of the 
local population, with the majority coming from European 
American, middle- and upper-middle-class families. 
Children received a small gift for their participation. 

Nine children were excluded for not finishing the 
experiment, lack of engagement during the experiment, or not 
understanding English. Another eighteen children (18%) 
were excluded for ceiling performance in the Pretest. A total 
of eighty-three children were included in the subsequent 
analyses (40 females, mean age 4 years 8 months). 
Materials The false belief tests were displayed on a laptop. 
Simplified images of characters and events were displayed 
in semi-animated fashion using PowerPoint. 
Procedure The experiment was run at Northwestern 
University or at the child’s preschool. Children first 
completed the diverse desires task (Wellman & Woolley, 
1990; Repacholi & Gopnik, 1997)—an easy task for 4-year- 
olds. Then children completed the Pretest, comprised of 
three different false-belief tasks. These included the change 
of location task (Wimmer & Perner, 1983; Baron-Cohen, 
Leslie, & Frith, 1985), the unexpected contents task (Perner 
et al., 1987), and a verbal false belief task (Wellman & 
Bartsch, 1989; Siegal & Beattie, 1991). In all tasks, children 
had to answer both a target and memory question correctly 
in order to pass each task. For instance, in the change of 
location task, children were asked where the character will 
look for a given object, and where the object actually is. 

Following the Pretest, children were given brief training 
on thought bubbles, adapted from Wellman, Hollander, and 
Schult (1996). All children received thought-bubbles 
training, regardless of condition; however, only children in 
the experimental condition (Compare Thoughts) saw 
thought bubbles during subsequent training. No thought 
bubbles were used in the Pretest and Posttest. 

After the thought bubbles training, children were 
randomly assigned to one of three training conditions: 
Compare Thoughts, Compare Items, or Baseline. In the 
Compare Thoughts condition, children saw two boxes and 
two characters involved in an unexpected contents situation. 


In the classic version of the task, children are shown a box 
that appears to contain one thing but contains something 
different. After the child is shown the box’s true contents, 
they are introduced to a character who has never seen inside 
the box, and asked what the character thinks is inside the 
box. Young children often incorrectly answer that the 
character will already know what the box contains. In our 
version, thought bubbles displayed what the character 
thought was inside the box. This allowed us to ask children 
to compare mental states as well as states of the world. 

Children initially saw two cereal boxes, which opened to 
reveal that one contained cereal and the other did not. Then 
the boxes were closed and two characters were introduced. 
Thought bubbles showed that each character thought his box 
contained cereal (see Figure 1). The child was asked to 
directly compare the characters’ mental states: “Are Jay and 
Luke thinking the same or different?” Then they were asked 
to contrast the actual contents of the boxes: “Do the boxes 
contain the same or different things?” Next, the contents of 
the boxes were revealed to the characters. For each 
character, we asked: “Was he thinking the same or different 
than what was inside the box?” This question was intended 
to prompt the child to compare mental states with reality— 
revealing either a true belief or a false belief. Nearly all 
children answered these questions correctly. 

After this, children were presented with a new unexpected 
contents scenario, parallel to the first scenario but with new 
boxes, contents and characters. The same sequence of 
questions was repeated for this scenario. After this second 
scenario was completed, the two scenes—each with its own 
boxes and its own characters—were shown simultaneously, 
and children were asked to identify what was the same 
between the two stories: “Remember these two stories? Can 
you tell me what’s the same between these two stories?” 
The goal was to promote structural alignment between the 
situations and thereby foster noticing the common relations. 

The Compare Items condition was designed to test 
whether any gains in the experimental condition could be 
due to comparison itself. In this condition, for example, 
children were shown two characters, each of whom had 
brought various items to a picnic. The child was asked to 
make comparisons between the items. This training 
procedure had a similar number of comparison questions to 
the Compare Thoughts condition. 

The Baseline condition had no intervening task between 
the Pretest and Posttest; children went directly from the 
thought-bubbles training procedure to the Posttest. 

We predicted that children who made comparisons 
between mental states and reality and between characters’ 
mental states in the Compare Thoughts condition would 
make more gains from Pretest to Posttest than children in 
either the Compare Items or Baseline conditions. 


Results and Discussion 


Figure 1. The training scene shown in the Compare 
Thoughts condition of Experiment 1. 

A difference score was calculated for each child, subtracting 
the number of tasks the child passed in the Pretest from the 
number of tasks they passed in the Posttest. Because 
children with perfect Pretest scores were excluded, the 
difference scores could theoretically range from -2 to 3; 
however, the actual range of scores was from -1 to 3. 
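The scoring rule can be illustrated with a minimal sketch. It assumes, as the Procedure states, that a task counts as passed only when both the target and the memory question are answered correctly, and that each test contains three tasks. The names `TaskResponse`, `tasks_passed`, `difference_score`, `is_ceiling`, and `possible_range` are hypothetical and are not taken from the original analysis.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class TaskResponse:
    """A child's answers on one false belief task (hypothetical structure)."""
    target_correct: bool
    memory_correct: bool


def tasks_passed(responses: List[TaskResponse]) -> int:
    # A task is passed only if BOTH the target and memory questions are correct.
    return sum(r.target_correct and r.memory_correct for r in responses)


def difference_score(pretest: List[TaskResponse],
                     posttest: List[TaskResponse]) -> int:
    # Gain = tasks passed at Posttest minus tasks passed at Pretest.
    return tasks_passed(posttest) - tasks_passed(pretest)


def is_ceiling(pretest: List[TaskResponse]) -> bool:
    # Children who pass every Pretest task are excluded from the analyses.
    return tasks_passed(pretest) == len(pretest)


def possible_range(n_tasks: int = 3) -> Tuple[int, int]:
    # With ceiling scorers removed, Pretest scores run from 0 to n_tasks - 1,
    # while Posttest scores run from 0 to n_tasks.
    scores = [post - pre
              for pre in range(n_tasks)
              for post in range(n_tasks + 1)]
    return min(scores), max(scores)
```

With three tasks, `possible_range()` gives the theoretical bounds of -2 and 3 stated in the text.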

An ANOVA with difference score as the dependent 
variable and condition and gender as between-subjects 
factors showed a significant main effect of condition, 
F(2,77) = 5.30, p < .01, η² = .10. Planned comparisons 
indicated that children in the Compare Thoughts condition 
(M = .75, SD = 1.00) made more gains in false belief 
understanding than children in either the Compare Items 
condition (M = .19, SD = .68, p < .01) or the Baseline 
condition (M = .25, SD = .70, p < .01). We then compared 
these means to zero. We found that the mean gain in the 
Compare Thoughts condition was significantly greater than 
zero, t(27) = 3.95, p = .001, whereas the gains in the 
Compare Items and Baseline conditions were not reliably 
greater than zero, t(26) = 1.41, n.s., and t(27) = 1.89, n.s., 
respectively. 
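For readers who want to check the comparisons against zero, the following sketch computes a one-sample t statistic from summary statistics alone. It is an illustration rather than the original analysis: the group sizes are inferred from the reported degrees of freedom (n = df + 1), and because the inputs are the rounded means and standard deviations given above, the results only approximate the reported values. The function name `one_sample_t` is hypothetical.

```python
import math
from typing import Tuple


def one_sample_t(mean: float, sd: float, n: int,
                 mu0: float = 0.0) -> Tuple[float, int]:
    """Return (t, df) for a one-sample t-test computed from summary statistics.

    t = (mean - mu0) / (sd / sqrt(n)), with df = n - 1.
    """
    standard_error = sd / math.sqrt(n)
    return (mean - mu0) / standard_error, n - 1


# Compare Thoughts condition: M = .75, SD = 1.00, n assumed to be 28 (df = 27).
t_compare_thoughts, df_compare_thoughts = one_sample_t(0.75, 1.00, 28)

# Baseline condition: M = .25, SD = .70, n assumed to be 28 (df = 27).
t_baseline, df_baseline = one_sample_t(0.25, 0.70, 28)
```

Under these assumptions the Compare Thoughts statistic comes out near 3.97 and the Baseline statistic near 1.89, in line with the reported t(27) = 3.95 and t(27) = 1.89 once rounding of the inputs is taken into account.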

Interestingly, there was also a significant main effect of 
gender, F(1,77) = 11.13, p = .001, η² = .11. Across 
conditions, females (M = .675, SD = 1.00) made more gains 
from Pretest to Posttest than males (M = .14, SD = .56, p = 
.001). There was also a marginal interaction between 
condition and gender, F(2,77) = 2.93, p = .06, η² = .06. 
Bonferroni post hoc tests showed that females made more 
gains in the Compare Thoughts condition (M = 1.31, SD = 
1.11) than in the Compare Items condition (M = .46, SD = 
.78, p < .05) and the Baseline condition (M = .29, SD = .83, 
p < .01). Males did not differ in their performance across the 
three conditions; surprisingly, they showed no significant 
gains in performance in any condition, all n.s. 

Children made significant gains from Pretest to Posttest 
after making mental state comparisons. These results 
provide evidence that comparison between and among 
thoughts and states of the world can help children 
understand others’ mental states. 


Experiment 2 


Although the results of Experiment 1 provide support for 
the hypothesis that analogical comparison can facilitate 
false belief understanding, it left some open questions. First, 
the Compare Thoughts condition was extremely rich. 
Children compared mental states to states of the world, 
mental states to other mental states, and whole situations 
involving true and false mental states to each other. Clearly, 
this level of intensive comparisons is not likely to happen in 
real life. In Experiment 2, we aimed for a more naturalistic 
experience. We showed children one true/false belief story 
at a time, but varied how easy the stories were to compare. The 
prediction is that children will gain insight when 
comparison across the stories is easy. This approach better 
matches real life experience, in which children can and do 
spontaneously compare across similar instances if they are 
not too distant in time. 

Another concern is that children in the Compare Thoughts 
training received more exposure to mental states than those 
in the other conditions. In Experiment 2, we equated 
exposure to thought-bubbles and mental state depictions. 
We varied only the ease with which children could compare 
across instances. If we see more gains when comparison 
across instances is facilitated, this will provide evidence that 
comparison can support false belief understanding. 

Finally, to test the possibility that gains in this task could 
also be related to children’s command of mental state 
language, we included a story-telling task in which we 
measured children’s production of mental state verbs. We 
predicted that children who produced mental state verbs 
would benefit more from training than those who did not. 

In Experiment 2, we again used a Pretest-Training- 
Posttest structure. The training was again focused on the 
unexpected contents task. Our goal was to increase 
children’s sensitivity to the match (or nonmatch) between 
mental expectations and reality. To do so, we adapted 
Loewenstein and Heath’s (2009) repetition-break pattern, in 
which two parallel (and readily alignable) situations are 
presented sequentially, followed by another (alignable) 
situation that differs in an important way. The idea is that 
the alignment between the first two situations renders their 
common structure salient, so that the learner readily notices 
the change in the last scenario. 

In our procedure, children saw a series of three stories. In 
each story, a character looked at a box—for example, a 
crayon box—and a thought bubble appeared with the 
character’s belief about its contents (e.g., crayons). Then the 
contents of the box were revealed. In the first two stories, 
the character’s guess was correct (True Belief; TB). In the 
third story, the character’s belief was shown to be incorrect 
(False Belief; FB). If children can align the first two stories, 
the contrast between TB and FB should stand out. 

There were two conditions that varied the predicted ease 
of alignment across the stories. In the High Alignability 
(HA) condition, the three stories were similar in characters 
and objects; this should facilitate aligning the two stories 
and noting their common structure. The Low Alignability 
(LA) condition showed the same sequence (two TB and then 
a FB story), but the characters and objects differed across 
the stories, making it harder for children to align the stories. 
Thus, we predicted that children in the HA condition would 
show more gains than those in the LA condition. 

In addition to equating exposure to mental states, this 
simpler method was intended to reduce demands on 
attention. For each scenario, children attended to a single 
character and container, and there were fewer questions. 
Because the procedure was less demanding, we extended the 
age range to the whole 4-5 period. 


Methods 


Participants A total of 137 4- to 5-year-olds were recruited 
from the greater Evanston/Chicago area. The demographic 
make-up was similar to that of Experiment 1. 

Seven children were excluded for bringing a distracting 
toy into the testing area, not answering questions during the 
study, or experimenter error. Another 50 children were 
excluded for ceiling performance in the Pretest (38%). A 
total of 80 children were included in the subsequent 
analyses (38 females, mean age 4 years 6 months). 
Materials The false belief Pretest and Posttest were 
identical to those of Experiment 1, except that we included 
an extra task: a story-telling task in which two brothers 
engaged in deception. This was used to measure children’s 
production of mental state words. 
Procedure The overall procedure was similar to that of 
Experiment 1. After completing the diverse desires warmup 
task, children completed the story-telling task. Their 
utterances were transcribed and we coded whether children 
used mental state verbs to describe the scenes. 

Following the story-telling task, all children completed 
the Pretest, followed by the thought bubbles training 
procedure. Then children were randomly assigned to either 
the HA or the LA condition. In both conditions, children 
saw three stories presented sequentially: two TB stories 
followed by a FB story. Specifically, the first two stories 
showed ‘expected contents’ situations; the third showed the 
classic ‘unexpected contents’ situation. 

In each of the three stories, children saw a box with 
obvious contents (such as crayons) and a character who had 
not yet seen inside the box. A thought bubble appeared, 
depicting the character’s belief about the contents of the 
box. Then we revealed the contents of the box. The child 
was then asked “Was she right?”—that is, did the 
character’s thought bubble match reality. The child’s answer 
was confirmed by the experimenter, and then they moved 
on to the next story. The idea was to elicit a comparison 
between the character’s belief and the true contents 
of the box. The first two stories depicted a TB: the character’s 
prediction was right. The third story depicted an FB: the 
character’s prediction was wrong. The idea was that if the 
child had successfully aligned the first two TB scenarios, 
then the contrast with the FB scenario in the third story 
should be highly salient. 

We manipulated the alignability of the stories in two 
ways: (1) the characters and objects were highly similar in 
the HA condition and much less similar in the LA condition; 
(2) the same mental verb “think” was used to describe each 
story in the HA condition; in the LA condition, “think” was 
used in stories 1 and 3 and “believe” was used in story 2. 
These dissimilarities were predicted to make alignment 
more difficult in the LA condition. Thus we predicted that 
the HA group would be more likely to align the first two 
stories and extract their common relational structure, and 
therefore to notice the difference between TB and FB. 
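The two training sequences can be summarized in a small sketch. It encodes only what is described above: three stories in the order TB, TB, FB, with the verb “think” throughout in the HA condition and “think”, “believe”, “think” in the LA condition. The actual characters and containers are not reproduced here, so the `character` and `container` fields hold placeholder labels; all names in the code (`Story`, `build_sequence`) are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Story:
    belief: str      # "TB" (true belief) or "FB" (false belief)
    verb: str        # mental state verb used to narrate the story
    character: str   # placeholder label, not an actual stimulus name
    container: str   # placeholder label, not an actual stimulus name


def build_sequence(condition: str) -> List[Story]:
    """Build the three-story training sequence for "HA" or "LA"."""
    beliefs = ["TB", "TB", "FB"]  # repetition-break order, same in both conditions
    if condition == "HA":
        verbs = ["think", "think", "think"]
        similarity = "similar"      # characters and objects highly similar
    elif condition == "LA":
        verbs = ["think", "believe", "think"]
        similarity = "dissimilar"   # characters and objects differ across stories
    else:
        raise ValueError(f"unknown condition: {condition!r}")
    return [
        Story(belief=b, verb=v,
              character=f"{similarity}_character_{i}",
              container=f"{similarity}_container_{i}")
        for i, (b, v) in enumerate(zip(beliefs, verbs), start=1)
    ]
```

Laid out this way, the two conditions share the same belief sequence and differ only in surface similarity and verb consistency, which is the contrast the alignability manipulation relies on.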


Results & Discussion 


A difference score (gain) was calculated for each child, 
subtracting the number of FB tasks passed in the Pretest 
from the number of FB tasks passed in the Posttest. These 
scores ranged from -2 to 3. For the story-telling task, we 
measured whether children produced at least one mental state 
verb (want, believe, or know). 

An ANOVA with gain as the dependent variable and 
condition, gender, and mental state language as 
between-subjects factors revealed a main effect of condition, 
F(1,72) = 3.50, p = .03, η² = .05. Bonferroni post-hoc tests 
indicated that children in the HA condition (M = .75, SD = .84) 
made more gains from Pretest to Posttest than children in the 
LA condition (M = .29, SD = .93). We also compared these 
means to zero. We found that the gains in both the HA and 
LA conditions were significantly above zero, t(39) = 5.65, p 
< .001, and t(40) = 2.02, p = .05, respectively. 

We did not find a significant difference in gains between 
children who produced mental state language and those who 
did not, F(1,72) = 2.31, p = .13, η² = .03, nor was there a 
significant interaction between condition and mental state 
language, F(1,72) = 1.68, p = .20, η² = .02. However, when 
we compared these means to zero, we found an effect in the 
LA condition: only children who produced mental state 
language made significant gains (M = .60, SD = .68), t(19) = 
3.94, p = .001. Children in the LA condition who did not 
produce mental state language did not make gains (M = 0, 
SD = 1.05), t(20) = 0.00, p = 1.00. This difference did not 
hold for the HA condition, in which children showed 
significant gains whether they produced (M = .77, SD = .90) 
or did not produce (M = .73, SD = .83) mental state language. 
Controlling for language, we found a marginal interaction 
between gender and condition, F(1,72) = 2.13, p = .09, η² = 
.03. Bonferroni post hoc tests revealed that while females 
made similar gains in both the HA and LA conditions, 
males made more gains in the HA condition than in the LA 
condition (p < .01). 

As predicted, we found that children made more gains 
from the HA condition than the LA condition. It appears 
that sequential comparison of alignable situations can 
increase children’s insight into mental states. 


General Discussion 


Theories of ToM development have not typically considered 
analogical processes as important to children’s developing 
understanding of others’ minds. Here we provide evidence 
that these processes can be a route to understanding mental 
states. In Experiment 1, asking children to explicitly 
compare across mental states and between mental states and 
states of the world allowed them to see what was similar or 
different between these elements across different characters. 
Children who received this training showed gains on false 
belief tasks. In Experiment 2, we used a more naturalistic 
procedure. Both groups of children received three stories 
depicting mental states (TB, TB, and FB). But we varied 
how easy it was for children to compare across instances by 
varying their alignability. When comparison across stories 
was easy (HA condition), children made more gains in false 
belief understanding than when it was difficult (LA). This 
large difference in gains is noteworthy, given that the two 
groups received the same kinds of stories in the same order, 
varying only in the similarity of characters and objects. 

Interestingly, children who produced mental state verbs in 
the story task made gains in both conditions, whereas those 
who did not made gains only when comparison was easy 
(HA). This suggests that greater knowledge of mental states 
(as indexed, and, possibly, abetted by production of mental 
state language) may facilitate analogical comparison across 
mental state scenarios. Such an effect would be consistent 
with previous findings. Evidence suggests that less 
sophisticated learners (in this case, children who did not 
produce mental state language) require closely aligned 
situations in order to benefit from comparison; but with 
increasing domain knowledge (here, producing mental state 
language) learners can align relationally similar situations 
even when the situations lack concrete similarity (Gentner, 
2010; Kotovsky & Gentner, 1996). Thus, children who 
grasp these verbs may be in a better position to notice 
relational similarities across instances and extract 
underlying regularities about beliefs. 

The findings also suggest possible effects of gender in the 
ability to gain from these experiences. In Experiment 1, 
only females showed specific gains from the Compare 
Thoughts training. In Experiment 2, there was a suggestion 
that females gained from both high- and low-alignability 
comparison, while males required high-alignability 
comparisons. Gender differences in mental state 
understanding have been reported in prior work (Charman et 
al., 2002). Future work should clarify the nature and extent 
of these differences. 

How might these kinds of analogical processes influence 
children’s ToM development in everyday life? We believe 
that the training in Experiment 2 simulated events that 
children are likely to encounter. Children spontaneously 
compare between similar situations in their everyday 
experience. We suspect that this is particularly likely when 
similar language is used across them. Evidence suggests that 
common language invites comparison (Gentner & Namy, 
1999). For instance, when children hear the same mental 
state verb used across different situations, they may seek 
commonalities across those situations (Baldwin & Saylor, 
2005). Children from the age of 2 are capable of producing 
contrastive statements that explicitly compare mental states 
(Bartsch & Wellman, 1995), such as “You like it, but I don’t 
like it,” suggesting that children compare at least some 
aspects of mental states even at an early age. 

Of course, thought bubbles do not exist in the real world. 
Nonetheless, children can infer some aspects of mental 
states through the language and affective reactions of the 
people around them. And as children learn mental state 
verbs, they should make gains in the ability to track and 
compare others’ mental states. 

In sum, we propose that analogical comparison processes 
operating over social experiences are instrumental in 
children’s understanding of mental states and their relation 
to the factual world. 